Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kind integrated test workflow MVP #1269

Closed
wants to merge 2 commits into from

Conversation

BenTheElder
Copy link
Member

implements #986

@BenTheElder
Copy link
Member Author

/hold

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jan 18, 2020
@k8s-ci-robot k8s-ci-robot requested review from aojea and munnerz January 18, 2020 01:02
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: BenTheElder

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 18, 2020
#
export KIND_BUILD_TYPE="${BUILD_TYPE:-bazel}"
export KIND_TEST_FOCUS_OVERRIDE="${FOCUS:-}"
export KIND_TEST_SKIP_OVERRIDE="${SKIP:-}"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the future we will stop doing this and just control the lists in code centrally (IE kind test presubmit will have a fixed list, with the override being meant for local prototyping hacks, not CI)

@BenTheElder BenTheElder force-pushed the test-commands branch 2 times, most recently from 31dece3 to 7e40db8 Compare January 18, 2020 01:16
// In Kubernetes CI we don't have real IPv6 connectivity so we need to
// employ some work arounds when testing IPv6 clusters
// We do this all environments for consistency.
if config.Networking.IPFamily == v1alpha4.IPv6Family {
Copy link
Contributor

@aojea aojea Jan 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add a TODO for DualStack
config.Networking.IPFamily == v1alpha4.IPv6Family || config.Networking.IPFamily == v1alpha4.DualStackFamily

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can't realy TODO constants that don't exist yet, dual stack my not be a family, it will probably be many families?
dual stack as an API is a TODO.

also, there is validation long before this code rune that errors if not one of ipv4 / ipv6 specifically to handle the fact that this command is not aware of any other families currently

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

skip = strings.Join([]string{`\[Serial\]`, skip}, "|")
}
cmd := exec.Command(
"./hack/ginkgo-e2e.sh",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need the ./hack/ginkgo-e2e.sh bash script?
a quick look to the script seems that we can avoid most of the checks and run directly the e2e.testbinary,

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's orthogonal to this PR. but in any case I'm not convinced that this is useful right now, it will skew versus other test callers and this script also does helpful things like locating the e2e binary in the many possible build paths.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

node := nodes[0]

// "fix" coreDNS configmap
// TODO: refactor this to not be a shell script.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1000
This can also be a good /help wanted issue

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe? probably not. the replacements need to be equivalent. i'm not sure how many casual contributors know sed / shell and go string manipulation deeply 🤷‍♂

Args: cobra.NoArgs,
// TODO(bentheelder): more detailed usage
Use: "test",
Short: "Tests one of [presubmit, conformance]",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how can I run my custom tests? this is specially useful to troubleshoot e2e tests or to create new ones, where you only want to run 1 or a small number of tests.
can we add one subcommand that allows to specify my custom FOCUS and SKIP fields?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • see the e2e.sh for details on overriding these
  • it's not intended to run your tests. it's intended to run the standard flows in a way a non-power-user can trivially mimic by invoking the command(s). you as a power user can still write a little script with your flows.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I've found my answer, just let me rephrase, how can I specify FOCUS and SKIP with this change?
I have to use the env variables KIND_TEST_SKIP_OVERRIDE and KIND_TEST_FOCUS_OVERRIDE right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

those are temporary for transitioning the prowjobs
you can run e2e.test / ginkgo e2e.sh yourself currently.

// don't use the horrid e2e.test """provider""" concept
"--provider=skeleton",
// need to inform tests of worker count
"--num-nodes=", strconv.Itoa(numNodesForE2E),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the jobs are failing with

I0118 01:25:13.095] [1] invalid value "" for flag -num-nodes: strconv.ParseInt: parsing "": invalid syntax
I0118 01:25:13.095] [1] Usage of /go/src/k8s.io/kubernetes/bazel-bin/test/e2e/e2e.test:
🤷‍♂

Copy link
Contributor

@aojea aojea Jan 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, there is the problem, is adding a space

W0118 01:25:14.472] Call stack:
W0118 01:25:14.472]   1: ./hack/ginkgo-e2e.sh:143 main(...)
W0118 01:25:14.472] Exiting with status 1
W0118 01:25:22.159] ERROR: command "./hack/ginkgo-e2e.sh --provider=skeleton --num-nodes= 2 --report-dir /workspace/_artifacts --disable-log-dump=true --ginkgo.focus '' --ginkgo.skip '\[Serial\]|'" failed with error: exit status 1

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, the space is a rendering of passing this to the shell.
the problem is that we're passing it as two arguments, but the first contains =

marked the PR WIP, I haven't even run this yet. by inspection it's mostly done but it's still a prototype. I'm going to rework things a bit before un-WIP-ing it.

@BenTheElder BenTheElder changed the title kind test MVP [WIP] kind integrated test workflow MVP Jan 18, 2020
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 18, 2020
@BenTheElder
Copy link
Member Author

/test pull-kind-e2e-kubernetes

@BenTheElder
Copy link
Member Author

e2e is failing because kubectl isn't being found on the path when building with bazel + running ginkgo-e2e.sh. probably not actually fixed upstream :/
will settle on an option for this tomorrow.

return
} else if err2 != nil {
err = err2
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is only if err3 != nil or we alway assign err3 to err at this point?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't care at this point, because if err3 is nil at this point so are err2 and err /shrug

// build test binaries
switch buildType {
case "bazel":
if err := exec.InheritOutput(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this writes to the logger, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no. this wires it to stdout / stderr.

this respecting streams is a broad TODO, but it's not super important for cmd/ at the moment


"github.com/spf13/cobra"

"sigs.k8s.io/kind/pkg/apis/config/v1alpha4"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't we use the internal APIs?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no. internal apis are for:

  • translating between versions
  • allowing runtime code implementing the API to handle a single version, possibly with extra (computed?) data that is not part of the public API.

the CLI is a client of the runtime.

}

func enableIPv6Workarounds(provider *cluster.Provider, name string) error {
// get a control plane node to use run kubectl from
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we should check that ipv6 is enabled? or at least ipv6 forwarding?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the caller is responsible for having ipv6 enabled docker. we're not going to muck with the host in kind

if err != nil {
return errors.Wrap(err, "failed to get nodes for coreDNS IPv6 CI fixes")
}
nodes, err = nodeutils.ControlPlaneNodes(nodes)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, a different name to the assigned variable to distinguish from the previous one?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's anti-helpful here. we don't want to retain and accidentally use the unfiltered node list.
we could break this out into a helper method but it's just a few lines so that's not super helpful

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nah, no problem at all, is just there are at least 1 or 2 places more doing this that way, that's why I added the nit 😄

@BenTheElder
Copy link
Member Author

/retest
/hold

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 30, 2020
@k8s-ci-robot
Copy link
Contributor

@BenTheElder: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 10, 2020
@aojea
Copy link
Contributor

aojea commented Mar 3, 2020

Do you think that is possible to implement 2 executions of the e2e.test?
One using parallel without the serial tests and other only with the serial tests and aggregating the results of both runs

@BenTheElder
Copy link
Member Author

BenTheElder commented Mar 4, 2020 via email

@BenTheElder BenTheElder mentioned this pull request May 18, 2020
@BenTheElder
Copy link
Member Author

okaaay, let's see about getting this up to date again

@fejta-bot
Copy link

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 17, 2020
@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Sep 16, 2020
@BenTheElder BenTheElder removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Sep 25, 2020
@kubernetes-sigs kubernetes-sigs deleted a comment from fejta-bot Sep 25, 2020
@fejta-bot
Copy link

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 24, 2020
@fejta-bot
Copy link

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jan 23, 2021
@k8s-ci-robot
Copy link
Contributor

@BenTheElder: The following tests failed, say /retest to rerun all failed tests:

Test name Commit Details Rerun command
pull-kind-conformance-parallel-ipv6 e445481 link /test pull-kind-conformance-parallel-ipv6
pull-kind-conformance-parallel-1-17 e445481 link /test pull-kind-conformance-parallel-1-17
pull-kind-e2e-kubernetes-1-18 e445481 link /test pull-kind-e2e-kubernetes-1-18
pull-kind-test e445481 link /test pull-kind-test

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@fejta-bot
Copy link

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-contributor-experience at kubernetes/community.
/close

@k8s-ci-robot
Copy link
Contributor

@fejta-bot: Closed this PR.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-contributor-experience at kubernetes/community.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants